ABSTRACT

The origin, history, current status and problems of Data Envelopment Analysis (DEA) on empirical
multi-input, multi-output data are surveyed in relation to efficiency valuation, production function
determination and stochastic frontier estimation.


KEYWORDS 

Data  Envelopment  Analysis,  Efficiency  Valuation,  Production  Functions,  Frontier  Estimation 




ORIGIN 


Data Envelopment Analysis (DEA) began as a generalization of the usual scientific-engineering
efficiency valuation of a single input, single output system as the ratio of the output
to input (in the same physical measure, e.g., energy) to multi-input, multi-output systems (or
organizations or production units) without known "physical" laws or the same measure for all
inputs and outputs. This was accomplished by (i) reduction of the multi-inputs and outputs to
single "virtual" inputs and outputs, (ii) replacing absolute efficiency by efficiency relative to all
members of a sample of units (called DMU's) having the same inputs and outputs, and (iii)
evaluating a unit's relative efficiency as the maximum of the ratio of its virtual output to virtual
input, subject to virtual outputs being less than or equal to virtual inputs for each (all) of the
DMU's.
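In the single-input, single-output case the construction in (i) to (iii) collapses to a closed form: the virtual multipliers enter only through their ratio, so the maximum is the unit's own output/input ratio divided by the best ratio in the sample. The Python sketch below illustrates this special case only; the function name and the sample figures are hypothetical.

```python
def relative_efficiency(inputs, outputs, o):
    """Relative efficiency of unit `o` in a one-input, one-output sample.

    With one input and one output, maximizing (u * y_o) / (v * x_o)
    subject to (u * y_j) / (v * x_j) <= 1 for every unit j gives
    (y_o / x_o) divided by the largest ratio y_j / x_j in the sample.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    return ratios[o] / max(ratios)


# Hypothetical sample of four units (illustrative numbers only).
inputs = [2.0, 4.0, 5.0, 10.0]
outputs = [1.0, 3.0, 2.5, 6.0]
scores = [relative_efficiency(inputs, outputs, o) for o in range(4)]
```

The unit with the best ratio scores one, and every other unit is rated relative to it rather than against an absolute physical standard.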

Evidently there are infinitely many ways to construct virtual inputs and outputs. One
involves multiplying the inputs and outputs by non-negative "virtual multipliers" and adding to
get a single virtual input or output. Another is raising them to non-negative powers and
multiplying them together to get a virtual input or output. Out of (iii), the dual mathematical
programming problem arising in the first way (Charnes et al. 1978) reproduces (and
generalizes) M. Farrell's efficiency evaluation (Farrell 1957) based on a "production possibility
set" consisting of the conical hull of the input-output vectors of the sample points. The
economic efficient production function is then a piecewise linear function based on the efficient
DMU's inputs and outputs. The second, "multiplicative" way (Charnes et al. 1981, 1983) leads
to piecewise Cobb-Douglas functions.
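The two constructions can be compared side by side. The following Python sketch, with hypothetical function names and numbers, forms a virtual output both ways and checks that the logarithm of the multiplicative form is the additive form applied to the logged data, which is why the multiplicative case gives log-linear pieces.

```python
import math


def additive_virtual(values, multipliers):
    """First way: weighted sum of the inputs (or outputs)."""
    return sum(w * v for w, v in zip(multipliers, values))


def multiplicative_virtual(values, powers):
    """Second way: product of the inputs (or outputs) raised to powers."""
    return math.prod(v ** p for v, p in zip(values, powers))


# Hypothetical output vector and non-negative multipliers / powers.
outputs = [3.0, 8.0]
weights = [0.5, 0.25]

linear = additive_virtual(outputs, weights)          # 0.5*3 + 0.25*8
product = multiplicative_virtual(outputs, weights)   # 3**0.5 * 8**0.25
log_form = additive_virtual([math.log(v) for v in outputs], weights)
```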


ECONOMIC  PRODUCTION  FUNCTIONS 

Farrell, working from the production function side, thought he had to assume constant
returns to scale to obtain his (single output) results. Others, like R. Shephard, thought a constant
elasticity assumption was needed to get log-linear (or Cobb-Douglas) results. Coming from
the "valuation" side via the dual mathematical programming problem to the "production
function" or "DEA" side, Charnes et al. (1978, 1981, 1983) showed that neither assumption
was necessary and that piecewise constancy would hold for returns to scale and for
elasticity, respectively, in the efficient economic "empirical" production function based on sample data.

In Charnes et al. (1985a), exploring the production function side, it was shown that all
of these "models" (and more) for testing the efficiency of DMU's are instances of the Charnes-Cooper
test (see Charnes and Cooper 1961 and Ben-Israel et al. 1977) for multi-criteria ("goal programming")
optimality, here specialized to "Pareto-Koopmans efficiency" relative to the specified production
possibility sets. These involve envelopment of the inputs from below and the outputs from
above, hence the name Data Envelopment Analysis (DEA) for all "models" of such efficiency
type.

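The first way started with the "CCR ratio" form:

\[
\max \frac{\eta^{T} y_0}{\xi^{T} x_0}
\quad \text{with} \quad
\frac{\eta^{T} y_j}{\xi^{T} x_j} \le 1, \quad \eta, \xi \ge 0, \quad j = 1, \dots, n
\tag{1}
\]

where $y_j^{T} = (y_{1j}, \dots, y_{sj})$ and $x_j^{T} = (x_{1j}, \dots, x_{mj})$
are the output and input vectors, respectively, of DMU$_j$, assumed positive, and $(x_0, y_0)$ is the
pair for DMU$_0$, one of the DMU's.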

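To eliminate false technical efficiency determinations (recognized by Farrell) stemming
from optimal entries of $\eta$ or $\xi$ being zero, it was immediately replaced by the
non-Archimedean CCR form:

\[
\max \frac{\eta^{T} y_0}{\xi^{T} x_0}
\quad \text{with} \quad
\frac{\eta^{T} y_j}{\xi^{T} x_j} \le 1, \quad
\frac{\eta^{T}}{\xi^{T} x_0} \ge \varepsilon e^{T}, \quad
\frac{\xi^{T}}{\xi^{T} x_0} \ge \varepsilon e^{T}, \quad j = 1, \dots, n
\tag{2}
\]

where $\varepsilon$ is a non-Archimedean infinitesimal and the $e^{T}$ are vectors of ones.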


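Using the Charnes-Cooper transformation

\[
\mu^{T} = t\,\eta^{T}, \quad \nu^{T} = t\,\xi^{T}, \quad t = (\xi^{T} x_0)^{-1}
\tag{3}
\]

the equivalent dual linear programs are:

\[
\text{CCR:} \quad \max \mu^{T} y_0
\quad \text{with} \quad
\nu^{T} x_0 = 1, \quad \mu^{T} Y - \nu^{T} X \le 0, \quad
\mu^{T} \ge \varepsilon e^{T}, \quad \nu^{T} \ge \varepsilon e^{T}
\tag{4.1}
\]

\[
\text{DEA:} \quad \min \theta - \varepsilon e^{T} s^{+} - \varepsilon e^{T} s^{-}
\quad \text{with} \quad
Y\lambda - s^{+} = y_0, \quad \theta x_0 - X\lambda - s^{-} = 0
\tag{4.2}
\]

and $\lambda, s^{+}, s^{-} \ge 0$, where $Y = [y_1, \dots, y_n]$, $X = [x_1, \dots, x_n]$.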
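As a reading aid, the step from (2) to (4.1) can be spelled out. Writing $\eta, \xi$ for the multipliers in (2) and $\mu, \nu$ for the transformed multipliers of (3), scaling by $t = (\xi^{T} x_0)^{-1} > 0$ fixes the denominator of the objective at one, and multiplying each ratio constraint through by the positive quantity $\xi^{T} x_j$ removes the remaining fractions:

\[
\begin{aligned}
\frac{\eta^{T} y_0}{\xi^{T} x_0} &= t\,\eta^{T} y_0 = \mu^{T} y_0,
\qquad \nu^{T} x_0 = t\,\xi^{T} x_0 = 1, \\
\frac{\eta^{T} y_j}{\xi^{T} x_j} \le 1
&\iff \eta^{T} y_j - \xi^{T} x_j \le 0
\iff \mu^{T} y_j - \nu^{T} x_j \le 0, \quad j = 1, \dots, n, \\
\frac{\eta^{T}}{\xi^{T} x_0} \ge \varepsilon e^{T}
&\iff \mu^{T} \ge \varepsilon e^{T},
\qquad
\frac{\xi^{T}}{\xi^{T} x_0} \ge \varepsilon e^{T}
\iff \nu^{T} \ge \varepsilon e^{T}.
\end{aligned}
\]

Collecting the $n$ inequalities column by column gives $\mu^{T} Y - \nu^{T} X \le 0$, which is the constraint set of (4.1).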

To be noted is that Färe and Hunsaker (Management Science, February 1986) present them
erroneously, draw erroneous conclusions, and solve all of their examples incorrectly. (See Charnes et
al. 1987.)

Computation is done on the DEA side. No non-Archimedean quantities need be used;
see Charnes and Cooper (1961), also I. Ali (University of Massachusetts, College of
Business), J. Stutz (University of Miami, Quantitative Management department) and the DEA
software of the Center for Cybernetic Studies. See also Charnes et al. (1986) for a much more
complicated Archimedean approach using multiple linear programs for each efficiency
determination.
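One way to keep the infinitesimal out of the arithmetic, assumed here purely for illustration and not claimed to be the procedure of the software cited above, is to solve the envelopment program (4.2) in two phases: first minimize the radial factor alone, then hold it fixed and maximize the total slack. The sketch below uses `scipy.optimize.linprog`; the function name `ccr_envelopment` and the four-DMU data set are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def ccr_envelopment(X, Y, o):
    """Two-phase solve of the envelopment program (4.2) for DMU `o`.

    X is m x n (inputs) and Y is s x n (outputs); columns are DMU's.
    Variable order: [theta, lambda (n), s_plus (s), s_minus (m)].
    """
    m, n = X.shape
    s = Y.shape[0]
    nv = 1 + n + s + m
    A = np.zeros((s + m, nv))
    b = np.zeros(s + m)
    # Output rows: Y lambda - s_plus = y_o
    A[:s, 1:1 + n] = Y
    A[:s, 1 + n:1 + n + s] = -np.eye(s)
    b[:s] = Y[:, o]
    # Input rows: theta x_o - X lambda - s_minus = 0
    A[s:, 0] = X[:, o]
    A[s:, 1:1 + n] = -X
    A[s:, 1 + n + s:] = -np.eye(m)

    # Phase 1: minimize theta alone (linprog keeps variables >= 0 by default).
    c1 = np.zeros(nv)
    c1[0] = 1.0
    phase1 = linprog(c1, A_eq=A, b_eq=b, method="highs")
    theta = phase1.x[0]

    # Phase 2: hold theta at its optimum and maximize the sum of slacks.
    c2 = np.zeros(nv)
    c2[1 + n:] = -1.0
    bounds = [(theta, theta)] + [(0, None)] * (nv - 1)
    phase2 = linprog(c2, A_eq=A, b_eq=b, bounds=bounds, method="highs")
    return theta, phase2.x[1 + n:1 + n + s], phase2.x[1 + n + s:]


# Hypothetical data: two inputs, one output, four DMU's (columns A, B, C, D).
X = np.array([[1.0, 2.0, 2.0, 4.0],
              [2.0, 1.0, 2.0, 1.0]])
Y = np.array([[1.0, 1.0, 1.0, 1.0]])

theta_C, s_plus_C, s_minus_C = ccr_envelopment(X, Y, 2)
theta_D, s_plus_D, s_minus_D = ccr_envelopment(X, Y, 3)
```

In this data set DMU D attains a radial factor of one yet retains a positive slack in its first input, which is the kind of false efficiency determination the non-Archimedean term in (2) is meant to rule out.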

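Also in Charnes et al. (1985a) is a most useful DEA model, today called the "additive"
model:

\[
\min\; -e^{T} s^{+} - e^{T} s^{-}
\quad \text{with} \quad
Y\lambda - s^{+} = y_0, \quad -X\lambda - s^{-} = -x_0, \quad
e^{T} \lambda = 1 \quad \text{and} \quad \lambda, s^{+}, s^{-} \ge 0
\tag{5}
\]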

Interestingly,  by  taking  logs  of  the  virtual  input-output  vectors  in  the  multiplicative  model,  it 
reduces  to  this  form. 


To ensure that the efficiency determined in the additive model is independent of the units
of measurement of the inputs and outputs, the $s^{+}$ and $s^{-}$ in (5) can be replaced by
$\hat{s}^{+}$, $\hat{s}^{-}$ with $\hat{s}_r^{+} = s_r^{+} / y_{r0}$ and $\hat{s}_i^{-} = s_i^{-} / x_{i0}$,
$r = 1, \dots, s$, $i = 1, \dots, m$. This also improves numerical stability in
the calculations.
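The effect of dividing each slack by the evaluated DMU's own output or input level can be checked numerically. In the Python sketch below (hypothetical function name and figures), re-expressing one input in a unit 1000 times smaller scales both its level and its slack by 1000, so the raw slack total changes while the normalized total does not.

```python
def normalized_slack_total(s_plus, s_minus, y0, x0):
    """Sum of slacks, each divided by the DMU's own output or input level."""
    return (sum(s / y for s, y in zip(s_plus, y0))
            + sum(s / x for s, x in zip(s_minus, x0)))


# Hypothetical evaluated DMU: two inputs, one output, with optimal slacks.
x0 = [40.0, 12.0]
y0 = [200.0]
s_minus = [10.0, 3.0]
s_plus = [50.0]

base = normalized_slack_total(s_plus, s_minus, y0, x0)

# Measure the first input in a unit 1000 times smaller.
x0_scaled = [x0[0] * 1000.0, x0[1]]
s_minus_scaled = [s_minus[0] * 1000.0, s_minus[1]]
rescaled = normalized_slack_total(s_plus, s_minus_scaled, y0, x0_scaled)

raw_base = sum(s_plus) + sum(s_minus)
raw_rescaled = sum(s_plus) + sum(s_minus_scaled)
```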

To allow for the important possibilities of thresholds on possible inputs and ceilings on
possible outputs, the "extended additive" model (see Charnes et al. 1987a) puts individual
bounds on the DEA side "slacks", which do not require additional rows of constraints in the usual
LP software.

In all the above models, each inefficient DMU determination provides it with a "facet"
of similar efficient DMU's, which is the convex hull of the DMU's with zero "reduced costs" in
an optimal basic simplex tableau for the inefficient DMU problem. These facets are the pieces
of the empirical efficient production function on which the function is linear (log-linear in the
multiplicative case). The union of the input sets of the facets, however, often does not cover the
desired input set for the production possibility set. It is an open problem to extend the function
to all of this set, i.e., how best to estimate an approximation to what a corresponding efficient
output to each input therein might be.


EFFICIENCY  VALUATION 

Every  DEA  analysis  involves  suitable  selection  of  inputs  and  outputs  to  assure  a 
reasonable  production  function.  Also  required  are  sufficient  sample  data.  Sometimes  proper 
inputs-outputs  or  sample  data  are  unavailable.  Sometimes  the  objective  is  only  to  determine  a 
single  (or  few)  "most"  efficient  DMU. 

For  example,  Thompson  &  Thrall  (1986)  determined  Waxahachie,  Texas  to  be  the 
"most  efficient"  site  for  the  Super  Collider  of  six  possible  locations. 


Starting  with  4  inputs,  4  outputs,  6  DMU's  and  employing  the  CCR  ratio  model,  5 
DMU's  were  efficient.  By  placing  additional  restrictions  on  pairs  of  virtual  multipliers  (called 
"assurance  regions")  i.e.  by  requiring  that  the  relative  valuations  of  certain  inputs  or  outputs 
were  in  specified  ranges,  their  new  DEA  model  recognized  only  one  DMU,  Waxahachie,  to  be 
efficient.  This  means,  however,  that  the  corresponding  efficient  production  function  so 
determined  consists  only  of  all  positive  multiples  of  the  Waxahachie  input-output  vector. 

The restrictions on the virtual multipliers placed them in cones which were the
intersection of half-spaces with the non-negative orthant. Working with the Pareto-optimality
or multi-criteria (or "dominance") DEA basis and the dual convex programming forms of
Ben-Israel et al. (1971), with one side's variables in a closed convex cone and the dual side's variables in
the (negative) polar cone, Charnes et al. (1987b) generalized the CCR ratio model of (1) to a
"cone-ratio" model which, with trivial extension, includes all assurance region embellishments
and which does not require the cones to be given as intersections of half-spaces.
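To make the "intersection of half-spaces" description concrete, consider an assurance region on a pair of input multipliers $\xi_1, \xi_2 > 0$ with bounds $0 < l \le u$; the symbols $l$ and $u$ are introduced here only for illustration. The ratio restriction is equivalent to two homogeneous linear inequalities:

\[
l \le \frac{\xi_1}{\xi_2} \le u
\iff
\xi_1 - l\,\xi_2 \ge 0
\quad \text{and} \quad
u\,\xi_2 - \xi_1 \ge 0 .
\]

Each inequality is a half-space through the origin, so the admissible multipliers form a polyhedral cone inside the non-negative orthant: if $\xi$ satisfies both inequalities, so does $c\,\xi$ for every $c > 0$.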

In D. B. Sun's Ph.D. thesis, which sought a more objective measure of the performance of bank
managers from Call Report data, the CCR ratio form rated two
notoriously inefficient banks (in particular years) as efficient (see Charnes et al. 1988). A
cone-ratio model was then essayed, with virtual multiplier cones taken as the conical hull of the CCR
optimal virtual multipliers of 3 banks unanimously top rated by bank experts. It correctly rated
the notorious ones as inefficient.

In this "sum" form, computation is reduced to the old CCR computation with the
input-output matrix multiplied by the matrix of the old optimal virtual multipliers of the selected top
rated DMU's. Thus no major new software is required. All "intersection" form cones for
assurance regions can be transformed by a matrix multiplication into "sum" form. Going from sum
to intersection form, the half-spaces are often more complicated than assurance regions (see
Charnes et al. 1989).
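The reduction to the old CCR computation rests on a one-line identity. If the admissible input multipliers are the non-negative combinations of K chosen multiplier vectors (the rows of a matrix A), then the virtual input of DMU j computed with such a multiplier equals the virtual input of the transformed column A x_j computed with the combination weights. The Python sketch below checks this on hypothetical numbers; the names `A`, `alpha` and `matmul` are illustrative only, and the same treatment would apply to the outputs.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Rows: hypothetical optimal input multipliers of two top-rated DMU's.
A = [[0.2, 0.1],
     [0.05, 0.3]]
# Columns: input vectors of three DMU's.
X = [[4.0, 2.0, 6.0],
     [1.0, 3.0, 2.0]]

# Transformed input matrix handed to the ordinary CCR computation.
X_cone = matmul(A, X)

# Any multiplier in the cone is a non-negative combination of the rows of A.
alpha = [2.0, 1.0]
xi = [sum(alpha[k] * A[k][i] for k in range(2)) for i in range(2)]

# Virtual inputs computed directly and via the transformed data.
direct = [sum(xi[i] * X[i][j] for i in range(2)) for j in range(3)]
via_transform = [sum(alpha[k] * X_cone[k][j] for k in range(2))
                 for j in range(3)]
```

Because the weights `alpha` are only required to be non-negative, the transformed problem has exactly the structure of (1), which is why no new solver is needed.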


STOCHASTIC  ASPECTS  OF  DEA 


Every DEA analysis involves sample data of inputs and outputs which are converted by
definite mathematical operations into other quantities. By definition such quantities are
"statistics." Therefore every DEA "model" is a stochastic model. Since, however, the
distribution functions of managerial performance at the different DMU's are unknown, we lack
appropriate statistical theory for our real statistical structures. Development of such theory and
appropriate computation is a major task for DEA research.

The current state of progress is perhaps best evaluated in Jati Sengupta's outstanding
1989 monograph, Efficiency Analysis by Production Frontiers: The Non-Parametric
Approach, which surveys and develops some DEA models in relation to past and current
econometric concepts. These "risk" elaborations (i.e., known statistical distributions) are
almost entirely for single output situations and fail to consider appropriately the "waste"
resident in inefficient "uncertain" managerial performance.

At a minimum, since efficient production function pieces have numbers of parameters
equal to the sum of the number of inputs plus outputs, one should have at least three times as many
DMU's as this sum. Practically this has been accomplished in real studies with DMU
observations over multiple time periods by "window analysis." (See Charnes et al. 1985b.)
There a DMU in each different period of a "window" of periods is treated as if it were a
different DMU. The same DMU in, say, three successive periods is treated as three DMU's,
thus tripling the number of DMU's in the sample window. Then the window is moved ahead
one period from the old start and an analysis is done on the new window, and so forth. From
the pattern of, say, efficiency scores for each particular DMU a good deal of practical
information on stochastic variability is at hand. For example, a drastic change in the score for a real
DMU at a particular time period across the windows is a strong signal that something unusual
was happening in that DMU at that time period, which should be investigated.
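The bookkeeping of window analysis can be sketched in a few lines of Python. The function names, DMU labels and sizes below are hypothetical, and the sample-size check reads the rule of thumb as "number of DMU's at least three times the number of inputs plus outputs", which is an interpretation.

```python
def windows(n_periods, width):
    """Period indices of each window; the window moves ahead one period at a time."""
    return [list(range(start, start + width))
            for start in range(n_periods - width + 1)]


def pooled_units(dmu_names, window):
    """Treat each (DMU, period) pair inside a window as a separate DMU."""
    return [(name, t) for name in dmu_names for t in window]


# Hypothetical study: 5 DMU's observed over 6 periods, windows of 3 periods.
dmus = ["A", "B", "C", "D", "E"]
all_windows = windows(n_periods=6, width=3)
sample = pooled_units(dmus, all_windows[0])

# Rule-of-thumb check for 2 inputs and 2 outputs.
n_inputs, n_outputs = 2, 2
needed = 3 * (n_inputs + n_outputs)
enough_in_window = len(sample) >= needed
enough_single_period = len(dmus) >= needed
```

Each pooled sample would then be passed to whichever DEA model is in use, giving every real DMU one score per period per window in which that period appears.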

Again, taking the median of the scores through the windows at each time period for a
DMU gives a reasonable temporal estimate of the efficiency performance of the DMU. The
totality for all DMU's then gives the temporal pattern of efficiency performance of them all. It
goes without saying that there is as yet no time series theory developed for such constructs.
Another useful tool is the "envelopment map", a matrix $(a_{ij})$ which records the number of times
DMU$_j$ is a facet generator for DMU$_i$. By summing the columns one can have instant
determination of which DMU's are most (or least) consistently efficient.

These window analysis techniques can also be applied to quantities other than
efficiency scores, e.g., rates of change of a particular output with respect to a particular input,
which would be important in the specification and temporal analysis of the relative effectiveness
of the different DMU's for this output with this input.

Again, only pieces of the efficient empirical production function are determined, and
these, stochastically, with robustness corresponding to the number of DMU's enveloped by the
associated facet. As mentioned, additional means for production function estimation across at
least the whole desired production possibility set domain (of inputs) are important to determine.

Despite  these  challenging  research  problems,  DEA  has  proved  itself  as  a  powerful  tool 
for  investigation  of  real  managerial  or  production  situations  and  with  the  most  unusual  feature 
of  developing  assessments  applicable  to  the  individual  productive  units  instead  of  averages 
across  the  mass  which  are  in  error  for  every  individual. 